-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use BOSS indexing in DBGSuccinct + make RowDiff independent #484
base: master
Are you sure you want to change the base?
Conversation
ecce9eb
to
707f7c2
Compare
Hi @hmusta, can you took a look here? I think it's still far from complete, but I'd still apreciate some input on the general direction to make sure that what I'm doing is aligned with what we want. |
First of all, I'll wait for workflows to run through to make sure it still works as intended on DBGSuccinct. After that, I wonder if there is some simple enough way to test this with some other graph types. Some of my older notes from a few months ago say the following:
Would be nice to check whether it is still relevant. |
We can test this by modifying the following block in
Then, we can add instantiations for other graph types (let's start with DBGHashFast for now) with RowDiff annotations in these places:
|
distance[v] = 0; | ||
(*terminal)[v] = true; | ||
}; | ||
call_nodes([&](node_index v) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we know this works, can we add num_threads
as a parameter to call_nodes
so we can do this with multiple threads?
The set_bit
, fetch_bit
, etc. methods used in BOSS::row_diff_traverse
can be used here for thread-safe atomic operations on sdsl::bit_vector
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, might be best to get back to it after some initial version is landed.
Co-authored-by: Harun Mustafa <[email protected]>
* Update download-artifact * Touch commit for workflows * Update upload-artifact to @v4 * Own artifact name for _noAVX
I hopefully fixed all unit tests. It seems that some integration tests still fail, so I'll look into it further over the next days. In the meantime I think it'd be nice to try to get it reviewed and merged, as the change becomes large and difficult to manage within one PR. Then, the tests for other graphs can be added in a separate PR. |
for (node_index i = 1; i <= max_index() && !terminate(); ++i) { | ||
if (is_valid(i)) { | ||
callback(i); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when it exists, we can probably replace this with valid_edges_->call_ones
, but with a try-catch
to implement terminate
. What do you think? It should be more efficient than calling operator[]
on all elements in the vector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally dislike the idea of using exceptions for control flow. I implemented this approach for now, but I feel like there should be other ways to work with it. For example, encode termination condition with the return value (instead of void
). But it seems like it will require another huge overhaul of the codebase...
On the same note, it seems that some functions use terminate()
for this, and others stop_early()
😔
@@ -260,34 +272,34 @@ TEST(RowDiff, GetAnnotationBifurcation) { | |||
RowDiff<ColumnMajor> annot(&graph, std::move(mat)); | |||
annot.load_anchor(fterm_temp.name()); | |||
|
|||
EXPECT_EQ("CTAG", graph.get_node_sequence(4)); | |||
EXPECT_EQ("CTAG", graph.get_node_sequence(graph.select_node(4))); | |||
ASSERT_THAT(annot.get_rows({3})[0], ElementsAre(0, 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be graph.select_node(4)-1
? Same for the ones below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, eyes, but it's only really needed for masked
version, while without masks it's consecutive indexing, so select_node
is excessive. Do you still want me to add it?
WIP pull request for CI.